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Abstract 

Energy markets and the associated energy futures markets play a crucial role in global economies. It is of great theoretical and 
practical significance to gain a deeper understanding of extreme value statistics of the volatility of energy futures traded on the 
New York Mercantile Exchange (NYMEX). We investigate the statistical properties of the recurrence intervals of daily volatility 
time series of four NYMEX energy futures, which are defined as the waiting times t between consecutive volatilities exceeding 
a given threshold q. We find that the recurrence intervals are distributed as a stretched exponential P q (f) ~ e (nT) 7 , where the 
exponent y decreases with increasing q, and there is no scaling behavior in the distributions for different thresholds q after the 
recurrence intervals are scaled with the mean recurrence interval f. These findings are significant under the Kolmogorov-Smirnov 
test and the Cramer- von Mises test. We show that empirical estimations are in nice agreement with the numerical integration results 
for the occurrence probability W q (At\t) of a next event above the threshold q within a (short) time interval after an elapsed time 
t from the last event above q. We also investigate the memory effects of the recurrence intervals. It is found that the conditional 
distributions of large and small recurrence intervals differ from each other and the conditional mean of the recurrence intervals scales 
■as a power law of the preceding interval t(to)/t ~ (ro/ff, indicating that the recurrence intervals have short-term correlations. 
Detrended fluctuation analysis and detrending moving average analysis further uncover that the recurrence intervals possess long- 
term correlations. We confirm that the "clustering" of the volatility recurrence intervals is caused by the long-term correlations well 
known to be present in the volatility. Our findings shed new lights on the behavior of large volatility and have potential implications 
in risk management of energy futures. 
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1. Introduction 



The price behaviors of oil futures and related energy futures 
play a crucial role in modern economic and financial system s 
and our everyday life (I Jones and KaulL 1 19961: ISadorskvl fl999h . 
Especially, people are more concerned with large price fluctua- 
tions than mild fluctuations for obvious reasons. Since extreme 
events are rare, it is a difficult task to infer their statistical prop- 
erties and different mathematical tools ha ve been developed to 
inves tigate the statistics of extreme values (IKotz and Nadarajahl 
1200(1 In this work, we adopt the recurrence interval analysis 
to study the behaviors of large volatility of four energy futures 
listed on the New York Mercantile Exchange (NYMEX). 

The idea of recurrence interval analysis is as follows. Con- 
sider a time series of normalized volatility v(f), as defined in 
Section [2] We are interested in the statistical properties of the 
inter-event times r (also called recurrence intervals) between 
successive "extreme" volatilities whose values are greater than 



a threshold q. Rather than working directly on those extreme 
volatilities, we obtain the statistical properties of recurrence in- 
tervals above a series of small thresholds q and try to uncover 
possible dependence of the statistical properties on the thresh- 
old q. If clear dependence is observed, we will be able to infer 
the statistical properties for extreme volatilities associated with 
large q values. 

The recurrence interval analysis has been applied to in- 
vestigate the extreme value statisti cs of time series in man y 
fields, such as rec ords of climate dBunde et all 2004 , 2005 ). 
seismic activities dSaichev and SornetteL 120061) . e nergy dis - 
sipati on rates of three-dimensional turbul ence (iLiu et al 



l2009h . heartbeat intervals in medicine science (Bogachev et al 
2009), precipitation and river runoff dBogachev an d Bunde 



uity returns (jY amasaki et al., 2006; Bogachevetal 1 120071: 
Bogachev and Bundel. 120081 12009a: RenandZhoul. l2010a : 



1 2012 ), Internet traffic dBogachev and Bundel l2009bl: ICai et al 
|2009J), finan cial volatilities dYamasaki et al. ,_ 20051), eq 
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Ludescher et all 1201 U lHe and Cheniboi lllMeng et all [20121) 
and trading volumes dPodobnik et all 120091: iRen and Zhoul 
2010bHLietalll201ll) . 



The majority of empirical studies have been carried out on 
financial volatility. In early years, it is argued that the distri- 
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bution of the recurrence intervals above a fixed threshold has a 
power-law tail: 



P q (T) 



-(1+7) 



(1) 



Specifically, iKaizoii and Kaizoiil (2004) found that the expo- 
nent y decreases from 1 .8 1 to 0.97 when the threshold increases 
from 0.1 to 0.9 for daily volatility for 800 stocks traded on 
the Tokyo Stock Exchange and decreases from 2.47 to 1.16 
when the threshold increas es from 0.05 to 0.3 for daily volatil- 
ity of the Nikkei 225 index jYamasaki et al.l (12005b reported that 
y « 1.0 for seven representati ve stocks and cur rencies which 
is independent of the threshold. iLee et al.l (|2006) observed that 
y « 1.0 for 1-min volatility of the Korean stock-market index 
KOSPI, and lGreco et al.l ( 12008b obtained a power-law distribu- 
tion for 1-min volatility of the Italian MIB30 index. 

However, the overwhelming consensus is that the recurrence 
intervals of financial volatility are distributed as a stretched ex- 
ponential: 



Pair) 



(2) 



which is supported by a handful of empirical evidence using 
daily or high-frequency da t a in developed or emerging stock 



mark ets dWang et all 120061 120071 ljung et all I2008t1biu et al. 
20081: IJeon et allbOlOtlWang and Wand, 1201 A. To our knowl 



edge, the only exception is that lZhang et al. [ (120 10b used an al- 
ternative function P q (j) ~ e _a nT ' to fit the distribution. The 
adoption of stretched exponential for modelling r ecurrence in 



terval distributions can be at least traced back to Bunde et al 



(12003b . It is also interesting to note that stretched exponen- 
tial distributions are ub i quitou s in natural and social sciences 
dLaherrere and Sornettell 1998b . 

It is important to note that the presence of scaling in the 
recurrence interval distributions is also a subtle issue. Al- 
though early works favor the presence of scaling behaviors, re- 
cent studies unveil mixed results showing that some stocks pos- 
sess scaling behaviors while others exhibit multiscaling behav- 
iors in a same market ( Wang et all l2008t iRen and Zhou 12008 



lors in a same market HW ang et aljJ/Ut 
Wang et all 120091: IRen et alll2009 a b) 



If the volatility process is Poissonian, that is, there are 
no linear or nonlinear temporal correlations in the volatility 
time series, then the recurrence intervals are exponentially 
distributed and have no lon g-term memory for any threshold 
(IKotz and N ad arai ah , l 2000h . This condition can be fulfilled 



if we shuffle the volatility time series to destroy any auto- 
correlations. There is numerical and analytical evidence ver- 
ifying that the long-term correlation of the original time se- 
ries has a remarkable influence on the recurrence interval dis- 



tribution (Bunde et al. 


. 20031 120051 1 Altmann and KantzL 12005 : 


Olla, 


2007; 


Santhanam and Kantz , 2008: Bogachev et all 


2007, 


2008 


). Moreover, the exponent y of the stretched exponen- 



tial distribution of the recurrence intervals is explicitly re- 
lated to the autocorr e lation exponent of the origi n al time series 



4- 



Al 



(Bunde et all 120041: 
120051: iLivina et al 112005b. 



mann and Kantzl 2005 : Bunde et al 



The remainder of this paper is organized as follows. Sec- 
tion|2]describes the data sets of NYMEX energy futures prices 



we shall investigate and the basic properties of the recurrence 
interval time series. Section [3] investigates the probability dis- 
tributions of the recurrence intervals of the volatility. We find 
that there is no scaling in the distribution and the empirical dis- 
tributions can be well fitted by stretched exponentials. Section 
[4] investigates the memory effects of the recurrence intervals. 
We show that there are both short-term and long-term correla- 
tions in the recurrence interval time series. We summarize our 
findings in Section|5] 

2. Data description 

We retrieve the daily prices of four NYMEX futures of crude 
oil (Light-Sweet, Cushing, Oklahoma), reformulated regular 
gasoline (New York Harbor), No. 2 heating oil (New York Har- 
bor), and propane (Mont Belvieu, Texas). The prices of crude 
oil are in dollars per barrel, while all others in dollars per gal- 
lon. The raw data sets are downloaded from the web site of 
the U.S. Energy Information Administration. For each energy's 
futures, we choose "contract 1" for analysis. The time periods 
of the records are 4 April 1983 - 2 October 2012 for crude oil, 
3 October 2005 - 2 October 2012 for gasoline, 2 January 1980 
- 2 October 2012 for heating oil, and 17 December 1993 - 18 
September 2009 for propane, each containing 7401, 5512, 8214 
and 3941 data points. 
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Figure 1: (Color online.) The upper panel illustrates the normalized volatility 
v(r) of the daily crude oil futures prices and the lower panel shows the recur- 
rence intervals r between successive normalized volatilities that are larger than 
a threshold q = 2. 

Denote Y(t) the futures price at time t. The volatility is de- 
fined as the absolute value of the logarithmic return 

R(t) = \lnY(t)-]nY(t-l)\, (3) 

and the normalized volatility is determined as follows: 

m = m (4 ) 

u [</?(0 2 >-<fl(0> 2 ] 1/2 ' (> 

The upper panel of Fig.[T]illustrates the time series of the nor- 
malized volatility of the crude oil. There is clear evidence 



2 



of the volatility clustering pheno menon indicating long-term 
correlations well docum ente d in Tabak and Caiueirol ( 2007 ). 
Elder and Serletisl d2008l) and lCunado et al.1 d2010h . 

The recurrence intervals t are then determined as the waiting 
times between successive volatilities exceeding a given thresh- 
old q. Assume that the volatility at day t is greater than q: 
v(t) > q. Then the recurrence interval is 



r(f) = min{f' - t : v(f') > q,f > t). 



(5) 



The lower panel of Fig. Q] shows the recurrence interval time 
series obtained for q = 2, where each recurrence interval r(f) 
is drawn at the time t of the first volatility v(t). During volatile 
time periods when large volatilities cluster, the recurrence inter- 
vals are small and dense. In contrast, during calm periods with 
small volatility, the recurrence intervals are large and sparse. 
These observations are consistent with the fact that the tempo- 
ral structure of the original volatility time series has significant 
impacts on the temporal structure and distribution of the recur- 
rence interval time series. 

3. Empirical probability distributions 

3.1. Tests of scaling in scaled probability distributions 

For each threshold q, we can obtain a sequence of recurrence 
intervals for each volatility time series. The empirical proba- 
bility distribution of the sequence of recurrence intervals can 
be determined for each q. Figure [2ja) illustrates the empirical 
probability distributions P q (j) of the volatility recurrence inter- 
vals t associated with different thresholds q for the WTI crude 
oil futures. Similarly, the empirical PDFs for gasoline, heat- 
ing oil and propane futures are shown in Fig. |2{b-d). With the 
increase of the threshold there are more large recurrence inter- 
vals and less small recurrence intervals. It is also found that all 
the curves in Fig. |2|a-d) for different thresholds and different 
equities have a similar shape. 

We are interested in the presence of poss i ble sc aling behav- 
iors in the PDFs. Following lYamasaki et al.l (12005b . we investi- 
gate the scaled recurrence intervals r/f, whose probability dis- 
tribution fq(j/f) is as follows 



/,(T/f) = P q (j)f, 



(6) 



where f is the mean return interval that depends on the thresh- 
old q. The scaled distributions are presented in Fig.|2je-h). We 
do observe that the curves of f q (r/f) have a better collapsing 
than the original curves of P q (j). However, it is not clear if there 
is a scaling behavior. If we shuffle the volatility time series, the 
resulting recurrence intervals are exponentially distributed for 
any threshold, as depicted in the lower panel of Fig. [2] 

Theoretically, if fJj/f) is independent of q, the scaled PDFs 
are regarded to exhibit scaling behaviors. Expressing alterna- 
tively, f q has a scaling behavior if and only if for any pair of 
qi and qj the two corresponding sequences of recurrence inter- 
vals have the same distribution, that is, f q . — f q . . In this work, 
we adopt the well established Kolmogorov-Smirnov (KS) test 
to check if f q . = f qj . 



The standard KS test is designed to test the hypothesis that 
the distribution of the empirical data is equal to a particular dis- 
tribution by comparing their cumulative distribution functions, 
which can also be applied to test if two samples have the same 
probability distribution. Denote F qi the cumulative probability 
function of f qr We calculate the KS statistic by comparing the 
two CDFs in the overlapping region: 



KS = max(|F ?l - F q .\j , q t + 



Qj- 



(7) 



When the KS statistic is less than a critical value CV, the hy- 
pothesis is accepted. The critical value is 



CV = c a y(m + n)/mn, 



(8) 



where m and n are the numbers of recurrenc e interval samples 
for q, and qj (lDarUngLll957tlStephensll 19741) . a nd the threshold 
is c„ — 1.36 a t the significance level of a — 5% ( Smirnovlll948l : 



Youngl 119771) . The null hypothesis is that the two samples are 
drawn from the same distribution and it is rejected if KS > CV 
at the significance level of 5%. 

We have performed KS tests for each pair of samples (r q ., T q ) 
for the four futures. The results are presented in Table [1] One 
can find that KS is much greater than CV in all cases. It means 
that there is no scaling in the distribution of recurrence intervals 
for different thresholds q. 



Table 1 : Two-sample Kolmogorov-Smirnov test of possible scaling behaviors in 
the return interval distributions by comparing the statistic KS with the critical 
value CV at significance level of a = 5%. The null hypothesis that the two 
samples have the same distribution is rejected if KS > CV. 

Crude oil Gasoline Heating oil Propane 



qi qj KS CV KS CV KS CV KS CV 



1.0 1.2 

1.0 1.4 

1.0 1.6 

1.0 1.8 

1.0 2.0 

1.2 1.4 



1.2 
1.2 



1.6 
1.8 
1.2 2.0 
1.4 1.6 
1.4 1.8 
1.4 2.0 
1.6 1.8 
1.6 2.0 



0.34 0.04 
0.30 0.04 
0.27 0.05 
0.42 0.05 
0.39 0.06 
0.30 0.05 
0.27 0.05 
0.25 0.05 
0.39 0.06 
0.27 0.05 
0.25 0.06 
0.23 0.06 
0.25 0.06 
0.23 0.06 



0.36 0.04 
0.30 0.05 
0.26 0.05 
0.40 0.06 
0.47 0.06 
0.30 0.05 
0.26 0.05 
0.24 0.06 
0.37 0.06 
0.26 0.06 
0.24 0.06 
0.22 0.06 
0.24 0.06 
0.22 0.07 



1.8 2.0 0.23 0.07 0.22 0.07 



0.35 
0.30 
0.26 
0.41 
0.47 
0.30 
0.26 
0.23 
0.38 
0.26 
0.23 
0.38 
0.23 
0.22 
0.19 



0.04 
0.04 
0.05 
0.05 
0.05 
0.04 
0.05 
0.05 
0.06 
0.05 
0.05 
0.06 
0.06 
0.06 
0.06 



0.33 0.06 
0.30 0.07 
0.26 0.07 
0.41 0.08 
0.49 0.08 
0.30 0.07 
0.26 0.07 
0.26 0.08 
0.40 0.09 
0.26 0.08 
0.26 0.08 
0.25 0.09 
0.26 0.09 
0.25 0.09 
0.25 0.10 



3.2. Fitting the PDFs 

We use the stretched exponential, expressed in Eq. (f2| to 
fit the empirical recurrence interval distributions of financial 
volatility. Considering Eq. (O, we have 



/,(*) = m 



-(afxf 



(9) 




10"' 10° io' 10 2 10"' 10° io' 10 2 10"' 10° io' 10 2 10 1 10° io 1 io 2 



t/t t/t t/t t/t 

Figure 2: (Color online.) (Top panel) Empirical probability distributions P q (r) of the volatility recurrence intervals r with different thresholds q for the four futures: 
(a) Crude oil, (b) Gasoline, (c) Heating oil, and (d) Propane. The solid lines are the best fits to the stretched exponential function. (Bottom panel) Scaled probability 
distributions P f/ (r)f of the volatility recurrence intervals t/t with different q values for the four futures: (e) Crude oil, (f) Gasoline, (g) Heating oil, and (h) Propane. 
Also shown in plots (e-h) are the scaled probability distributions for the shuffled volatility time series, which have been shifted downwards for clarity. 



where x — t/t, c and a are two parameters and y is the correla- 
tion exponent characterizing the long-term memory of volatil- 
ities. Due to t he definition of x, the parameters a and c ar e 
dependent of y dAltmann and Kantzl 120051: IWang et all 120081) . 
The normalization condition states that 



1 



dx. 



(10) 



Ux)dx = cf e- {afxy 
Jo Jo 

Let y = (afx) y and notice that C f~ l e~'dt = T(z). We have 

cr(l/y) = ay. (11) 
In addition, since the mean of x is 1 by definition we have 



1 



f 

Jo 



xf q {x)dx — ct I xe 



f 

Jo 



-(afxy 



dx. 



Similarly, we obtain 

cTdlyf =ar(2/r). 
Solving Eqs. (fTTT i and (fT3T l. we have 

a = r(2/r)/r(i/ T ) 

and 

c = r r(2/r)/r(i/ r ) 2 . 



(12) 



(13) 



(14) 



(15) 



Therefore there is only one free parameter y in the fitting of the 
normalized recurrence intervals x = t/t. 

However, the situation is more complicated. First of all, 
the recurrence intervals are integers and t is thus discrete. It 
means that the derivation above is not accurate and might be 
biased in capturing the "true" distribution. In addition, in most 
cases, one can fit on l y part of the distribution using chosen func- 



tions (IClauset et aUl2009l) . Hence, we use the three-parameter 



that are longer than some minimal value T m j n . The idea of the 
method is the same as in Clauset et al. I (l20Q9t) and iJiang et al 
d2013al) . which we describe below. 



The approach is based on the maximum likelihood estima- 
tion (MLE) and KS tests. We assume that the intervals larger 
than a truncated value T m j n are described by the three-parameter 
stretched exponential in Eq. ©. We also need to determine 
the lowest boundary r m j n as an additional parameter. A trun- 
cated sample is obtained by discarding the recurrence inter- 
vals less than r m j n in the original interval sample. Following 



Clauset et al.l (120091) . we search for the distribution parameters 
a, c and y of the truncated sample using MLE and the associated 
KS statistic for different lowest boundaries. The optimal T m i n is 
determined as the one corresponding to the truncated sample 
with the smallest KS value. 

The optimal value of T m i n and the corresponding estimates 
of a, c and y for the samples of recurrence intervals above six 
thresholds of the four N YMEX energy futures contracts are pre- 
sented in Table |2] We notice that only a few smallest recurrence 
intervals are excluded in the fitting and the stretched exponen- 
tial fits most of the data points. For parameter a, no clear de- 
pendence on q can be observed. Interestingly, we find that both 
c and y exhibit a decreasing trend with increasing threshold q, 
except that the case of heating oil with q — 1 .4 is an outlier. The 
decreasing trend of y is also observed for stock prices and stock 
indices when there is no scaling behavior in P q (j) and the val- 
ues of y of the e nergy futures are close to those of stock prices 



or stock indices (Ren and Zhou, 2008; Ren et al. i l2009ah . 



stretched exponential in Eq. © to fit the recurrence intervals 



3.3. Goodness-of-fit 

For each sample, we use the one-sample Kolmogorov- 
Smirnov test and Cramer-von Mises (CvM) test to assess the 
goodness-of-Fit after the optimal T m ; n and the corresponding 
distribution parameters are obtained. The null hypothesis Ho 



4 



Table 2: Estimates of r ra j n , a, c and y of the stretched exponential expressed in 
Eq. {2) using maximal likelihood estimation and the p- values of the goodness- 
of-fit tests using KS statistic and CvM statistic. 



Futures 





Miiin 


c 


(X 


y 


/If Q 


/-'CvM 


Crude oil 


1.0 


2 


37.24 


37.04 


0.35 


0.14 


0.23 




1.2 


2 


23.04 


38.50 


0.33 


0.11 


0.08 




1.4 


3 


24.88 


37.95 


0.32 


0.11 


0.26 




1.6 


2 


10.66 


40.00 


0.30 


0.21 


0.18 




1.8 


1 


2.82 


14.35 


0.32 


0.26 


0.32 




2.0 


1 


3.16 


25.10 


0.28 


0.23 


0.59 


Gasoline 


1.0 


3 


98.64 


33.75 


0.37 


0.59 


0.64 




1.2 


3 


50.15 


33.35 


0.35 


0.46 


0.66 




1.4 


3 


35.15 


38.40 


0.33 


0.15 


0.37 




1.6 


3 


19.24 


32.95 


0.32 


0.35 


0.33 




1.8 


2 


4.88 


14.60 


0.33 


0.63 


0.50 




2.0 


2 


5.82 


31.85 


0.28 


0.47 


0.43 


Heating oil 

± til. Ill _^ Ull 


1.0 


2 


30.26 


33.25 


0.35 


0.31 


0.18 




1.2 


3 


38.10 


40.00 


0.33 


0.21 


0.29 




1.4 


1 


1.77 


3.50 


0.43 


0.09 


0.06 




1.6 


2 


10.66 


40.00 


0.30 


0.48 


0.37 




1.8 


2 


8.04 


40.00 


0.28 


0.09 


0.46 




2.0 


2 


6.05 


40.00 


0.28 


0.23 


0.18 


Propane 


1.0 


2 


23.41 


38.85 


0.33 


0.83 


0.65 




1.2 


2 


16.55 


37.25 


0.32 


0.68 


0.61 




1.4 


1 


3.78 


14.00 


0.34 


0.62 


0.45 




1.6 


1 


2.96 


13.05 


0.33 


0.59 


0.42 




1.8 


1 


4.33 


39.25 


0.28 


0.31 


0.84 




2.0 


1 


3.49 


40.00 


0.27 


0.14 


0.23 



for our KS test and CvM test is that the left-truncated data 
(r > r m ; n ) are drawn from the fitted stretched exponential dis- 
tribution. The methods are described below, which are similar 



but not the same as in lRen and Zhoul (|2008). 

In the one-sample situation, the KS statistic defined in Eq. (0) 
for the real sample of recurrence intervals is calculated as fol- 
lows 



KS = max(\F q - F S e\) , 



(16) 



where Fse - Jq ce~ <flT) 7 dr is the cumulative distribution of 
P 9 (t) and the parameters a, c and y are associated with the 
given q. Then the bootstrapping approach is adopted to deter- 
mine the distribution of the KS statistic, p{KS s i m ). We generate 
10000 synthetic samples from the best fitted distribution and 
calculate the KS statistic for each synthetic sample in reference 
to the fitted distribution as follows 



KS sim = max(|F sim - F SE |) ■ 



(17) 



The p-value is determined by the proportion that KS s [ m > KS , 
which is the area enclosed by three curves p{KS s i m ), KS s j m = 
KS , and p{KS s ; m ) = 0. Hence, a small KS value corresponds 
to a large /9-value and high goodness-of-fit. 

Figure [3] illustrates the results of the goodness-of-fit tests 
using KS statistic for crude oil futures. The six distributions 



p(KS^ m ) have similar shapes. With the increase of q, the dis- 
tribution becomes broader and most probable value that corre- 
sponds to the maximum of the distribution increases. The es- 
timated p-values are presented in Table [2] all greater than 5%. 
The results for gasoline, heating oil and propane are also listed 
in Table [2] The minimal p-value is given by the recurrence in- 
tervals of heating oil with q = 1 .4, which is identified as an 
outlier due to its abnormally large value of y. We conclude that 
the stretched exponential can be used to model the recurrence 
interval distributions under the KS tests. 




Figure 3: (Color online.) Goodness-of-fit tests of the stretched exponential 
distributions using Kolmogorov-Smirnov statistic for different thresholds q for 
crude oil futures. In each plot, the black solid curve stands for the probability 
distribution p(KS s \ m ) and the red dashed line is the value of KS for the real 
sample. The p-value is the area enclosed by p(KS s i m ), KS s ; m = KS, and 
p(KS SIm ) = 0. 



As shown i n Pearso n and Stephens! d 19621) . IStephensI d 19641) 
and IStephensI (119701) . the CvM statistic is determined as fol- 
lows: 



W 



{F q - Fse) dFsE, 



(18) 



where F q is the CDF of empirical data for threshold q and is 
the total number of the recurrence interval samples. Then the 
bootstrapping approach is adopted to determine the distribution 
of the CvM statistic, p(W^ im ). We generate 10000 synthetic 
samples from the best fitted distribution and calculate the CvM 
statistic for each synthetic sample in reference to the fitted dis- 
tribution as follows: 



wi 



N 



o 



(Fsi m - Fse) dFsE, 



(19) 



The />value is determined by the proportion that W 2 - > W , 
which is the area enclosed by three curves p(W 2 ), W 2 ™ = W 2 , 

J 1 v sim'' sim ' 

and p(W ■ ) = 0. Hence, a small W value corresponds to a 
large p-value and high goodness-of-fit. 

Figure |4]illustrates the results of the goodness-of-fit tests us- 
ing CvM statistic for crude oil futures. The six distributions 
p(W 2 im ) have very similar shapes and the quantitative differ- 
ences between them are much smaller than in Fig. [4] The es- 
timated p-values are also presented in Table [2] as well as the 
results for gasoline, heating oil and propane. We find that all 
the p-values are greater than 5% and the minimal p-value is 6% 
for the outlier. Hence, the distributions of recurrence intervals 
above a fixed threshold q can be well fitted by stretched expo- 
nentials. 






— q = 1.2 




W 2 




Figure 4: (Color online.) Goodness-of-fit tests of the stretched exponential dis- 
tributions using Cramer-von Mises statistic for different thresholds q for crude 
oil futures. In each plot, the black solid curve stands for the probability distri- 
bution p(W 2 - ) and the red dashed line is the value of W 2 for the real sample. 
The p-value is the area enclosed by p(W 2 ),W 2 = KS , and p(W 2 ) = 0. 



3.4. Risk assessment 

Consider the situation that f trading days have elapsed since 
the last large volatility greater than q. We are interested in the 
hazard probability that a new volatility greater than q will occur 
within Af trading days. The hazard probability gives a quantita- 
tive estimation of risk. Mat hematically, the hazard probability 
can be expressed as follows (IBogachev et all 120071) : 

P q (t)dt 



W Q {At\t) 



fp q (t)dt 



(20) 



Since each distribution P q (t) has been fitted with a stretched 
exponential with the parameters given in Table [2] we can ob- 



tain the theoretical function of hazard probability W q {At\t) for a 
given value of At through numerical integration. 

In order to determine the W q {At\t) values empirically, we can 
rewrite Eq. ( f2Qb in the following way: 



W q (At\t) 



#(t <T q <t + At) 

Wja > t) ' 



(21) 



where the denominator #(r 9 > f) is the number of recurrence 
intervals greater than t and the numerator #(f < r q < t + At) is 
the number of intervals greater than t and not greater than t + At 
for a given q. 

Figure [5] illustrates the dependence of the hazard probabil- 
ity Wq(At\t) on t when fixing Af = 1 for crude oil futures. 
It is observed that the theoretical curves approximate the em- 
pirical curves very nicely. It is very important to notice that 
the discrepancy between theoretical and empirical curves de- 
creases when q increases. Another intriguing observation is that 
W q (At\t) decreases with increasing f, which is simply a conse- 
quence of the fact that recurrence intervals exhibit clustering 
behaviors as shown in Fig. |TJb) or long-term correlations that 
will be confirmed in Section [4] We also observe that, for small 
t values, 



W qi (At\t) > W q2 (At\t), 



(22) 



if qi > qi- This is consistent with the intuition that the hazard 
probability decreases when the condition approaches to extreme 
and there are less recurrence intervals. 
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Figure 5: (Color online.) Comparison of W c/ (At = l|f) estimated empirically 
using Eq. 12 U and numerically using Eq. )20t for different q values for crude 
oil futures. The black circles and red lines correspond respectively to empirical 
and numerical results. 

We have also studied two other cases in which At - 5 and 
At = 10. The results are qualitatively the same as for At = 1. 
A new finding is that W q (At\t) increases with Af for given t and 
q, which is actually trivial and can be derived from Eq. (1201 di- 
rectly. In general, W q (At\t) is a monotonously increasing func- 
tion of Af. All these findings apply to other three energy futures 
investigated in this work. 
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4. Memory effects 

Previous works have reported that the recurrence intervals of 
stock and stock index volatility possess both short-term mem- 
ory and long-term memory. It is interesting to investigate the 
memory effects of volatility recurrence intervals of energy fu- 
tures pri ces. Note that the meth ods used in this work were pro- 
posed bv lYamasaki et al.l (120051) . 



4.1. Short-term memory 

To investigate possible short-term correlations in the recur- 
rence intervals, we compute and compare the conditional prob- 
ability distributions P ci {t\tq), which is the distribution of recur- 
rence interval t conditioned on the value of its preceding recur- 
rence interval tq. If there is no short-term memory, P c/ (t\tq) is 
independent of to. However, it is hard to determine P 9 (t|to) for 
a single value of to since the size of the interval sample is not 
sufficiently large. We thus adopt an alternative approach, which 
employs the idea of coarse graining. 

For a given q, the set T of all recurrence intervals is parti- 
tioned into four non-overlapping subsets: 



T = TiUT 2 UT 3 U T 4 , 



(23) 



where T, n Tj ■ = <t> for ; + j. In the partitioning procedure, all 
recurrence intervals in T are sorted in an increasing order and 
then assigned in turn into Ti, T2, T3 and T4 such that their sizes 
are approximately identical. We estimate the empirical condi- 
tional probability density functions P (/ (t|T,) = P 9 (t|to e T,) of 
recurrence intervals that immediately follow a recurrence inter- 
val to belonging to T,. If recurrence intervals have no short- 
term memory, we should find that P 9 (t|T,) = P q (r\Tj) for any 
i * j. 
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Figure 6: Conditional distributions P 9 (r|T, ) of recurrence intervals in the small- 
est set Ti (filled symbols) and the largest set T4 (open symbols) for different 
thresholds q for the four energy futures: (a) Crude oil, (b) gasoline, (c) heating 
oil, and (d) propane. 

The results for Ti and T4 are illustrated in Fig. [6] We 
find that there is no scaling in the distributions P 9 (t|T,) in re- 
gard to different q value for any T, set. It is also evident that 



P q (r\T \)±P q (j\Y 4) and small (large) recurrence intervals are 
more probably followed by small (large) recurrence intervals, 
indicating that there is short-term memory in the recurrence in- 
terval time series. 
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Figure 7: (Color online.) Scaled conditional means f(Tfj)/f of recun'ence inter- 
vals in eight subsets for different thresholds q for the four energy futures: (a) 
Crude oil, (b) gasoline, (c) heating oil, and (d) propane. Black filled circles and 
red open circles correspond respectively to the results of the original volatility 
sereis and the shuffled volatility series. 



We further investigate the mean of the conditional recurrence 
intervals t(to). Each whole sample for a given q is divided into 
eight subsets and each subset is characterized by its mean re- 
currence interval to. The conditional mean t(to) is then deter- 
mined for each subset. For every q, the scatter plots of f(ro)/f 
against to/t for the four energy futures are shown in Fig. [71 
where f is the mean of the whole sample of recurrence intervals 
for q. Hence, each plot in Fig. UJ contains 8x6 points, where 
"8" is the number of subsets for each q and "6" is the number 
of q values. We observe a nice power-law dependence 



f (T )/f ~ (T /ff 



(24) 



with a positive power-law exponent /3, which also holds for 
stock volatility. In contrast, when we shuffle the original volatil- 
ity time series and repeat the above procedure, the conditional 
means are constant as shown in Fig.|7j i.e., t(tq) = f for each 
q. Therefore, Fig. UJ further confirms the presence of short-term 
correlations in the recurrence intervals. 

4.2. Long-term memory 

To investigate the possible long-term correlations in the re- 
currence intervals, we adopt the detrended fluctuation analy- 
sis (DFA) and the detrending moving average (DMA) analysis, 
which are regarded as "The M ethods of Choice" in determining 
the Hurst index of time series (iShao et all 120121) . Indeed, a lot 



of numerical experiments have unveiled that the performance 
of the DMA method are comparable to the D FA method with 



slightly differences under different si tuations dXu et al 
Bashan et al.ll2008l;IShao et alll2012h. 



2005 
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Figure 8: Detrended fluctuation analysis (top panel) and detrending moving average analysis (bottom panel) of the recurrence intervals of energy futures price 
volatility: (a,e) Crude oil, (b,f) gasoline, (c,g) heating oil, and (d,h) propane. Only three thresholds q = 1.0, 1.2 and 1.4 are considered, since larger thresholds result 
in shorter recurrence interval time series. In the plots of the bottom panel, the detrending moving average analyses with 8 = (backward DMA), 8 = 0.5 (centred 
DMA) and 8 = 1 (forward DMA) have been adopted. 



The DFA approach was invented by iPeng et alJ (119941) to 
investigate the long-range dependence in DNA nucleotide se- 
quences. The pr o perties of DFA have been extensively s tudied 



(IHu et al 



2001 



Chen etall 120021 120051: IShang et all 12009 



Xu et al.Ll2009t Ma et all 1201 Oh . The DMAapproach is based 



p. 

on the mo ving average technique (Carbone , |2009[) and de 
yelope d by Vandewalle and Ausloos (fl998l) and Alessio et al 



d2002l) . The DMA method has been widely applied to the 
analy sis of real-worl d time s eries ( Carbone and Castelli, 



Serletis and Rosenberg, 2007 



20031: ICarbone et all l2004tfl IVarotsos et alT 



Arianos and Carbona 



Matsushita et all l2007t ISer letis and Rosenberg, 2009) and 



synthetic sign als ( Carbone and Stanlevi 2004 : Xu et al 
SerletisLl2008l) 



2005 



2007 



2005 



The DFA and DMA methods are briefly described below. In 
the first step, the cumulative summation series y,- is determined 
as follows 



yi = Yj - < T >] ■ i = l ' 2 > 



,N, 



(25) 



where (r) is the sample mean of the return series. Next, we 
determine the local trend y,(s). The residual sequence e, is ob- 
tained by removing local trend function from y t : 



= yt - yi(s). 



(26) 



We can then calculate the overall fluctuation function F(s) as 
follows 



1 N 

[F(s)f = - J] [€i(s)] 

( = 1 



(27) 



As the box size s varies in the range of [20, N/4], one can deter- 
mine the power law relationship between the overall fluctuation 
function F(s) and the box size s, 

F(s) ~ s H , (28) 



where H signifies the DFA or DMA scaling exponent. It is nec- 
essary to emphasize that the scaling exponent H obtained from 
Eq. ( I2B1 should be called "DFA exponent" or "DM A exponent" 



rather than "Hurst index" in the rigorous sense ( IJiang et al 
2013bh . 



The main difference between the DFA and DMA algorithms 
is the determination of the "local trend function" which is 
dependent of box size s. In the DFA algo rithm, the local tr end 
is determined by polynomial fits in boxes jPeng et all 1 19941) . In 
the DMA approach, one calculates the moving averag e function 
y, in a moving window (lArianos and Carbonel 120071) . 



1 X yi ~ k > 



(29) 



/<=-/,„ 



where k B = [(s - l)6\, k F = \(s - 1)(1 - 6)~\, s is the win- 
dow size, [x\ is the largest integer not greater than x, fx] is the 
smallest integer not smaller than x, and 9 is the position pa- 
rameter with the value varying in the range [0, 1]. Hence, the 
moving average function considers \{s- 1)(1 -8)~\ data points in 
the past and l(s - l)f?J points in the future. Three special cases 
are adopted in this paper. The first case 8-0 refers to the back- 
ward moving average resulting in the backward DMA analysis 
(BDMA), in which the moving average function ^ is calculated 
over all the past 5-1 data points of the signal. The second case 
6 = 0.5 corresponds to the centered moving average resulting 
in the centred DMA analysis (CDMA), where y, contains half 
past and half future information in each window. The third case 
9 — 1 is called the forward moving average resulting in the for- 
ward DMA analysis (FDMA), where y, considers the trend of 
s - 1 data points in the future. 

Figure [8] shows the power-law dependence of the fluctuation 
function F(s) with respect to the box size s. Note that only 
three thresholds q = 1 .0, 1 .2 and 1 .4 are considered, since larger 
thresholds result in shorter recurrence interval time series. With 
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Table 3: Estimated DFA and DMA exponents of the recurrence intervals with q = 1.0, 1.2 and 1.4 for the four futures, which are the slopes of the power laws in 
Fig.[8]obtained with linear least-squares regressions. N q is the size of the recurrence interval sample for q. 







Crude oil 






Gasoline 




Heating 


oil 




Propane 




q 


1.0 


1.2 


1.4 


1.0 


1.2 


1.4 


1.0 


1.2 


1.4 


1.0 


1.2 


1.4 


N q 


2446 


1893 


1498 


2073 


1644 


1333 


2674 


2115 


1672 


1111 


884 


705 


DFA 


0.73 


0.75 


0.75 


0.78 


0.78 


0.80 


0.72 


0.72 


0.70 


0.71 


0.69 


0.72 


BDMA 


0.67 


0.68 


0.65 


0.74 


0.74 


0.75 


0.68 


0.66 


0.66 


0.53 


0.49 


0.43 


CDMA 


0.79 


0.80 


0.80 


0.80 


0.81 


0.86 


0.61 


0.67 


0.67 


0.56 


0.56 


0.56 


FDMA 


0.71 


0.68 


0.64 


0.76 


0.75 


0.75 


0.67 


0.68 


0.67 


0.52 


0.53 


0.51 



the increase of q, the length of the recurrence interval time se- 
ries decreases and the scaling range becomes narrower. A com- 
parison of the four corresponding power-law curves of DFA, 
BDMA, CDMA and FDMA for each recurrence interval time 
series unveils that some CDMA curves exhibit low-frequency 
trends around the power laws, and BDMA and CDMA curves 
have smaller fluctuations around the power laws than DFA 
curves. It is noteworthy to mention that crossover phenom- 
ena have been observed in the DFA fluctu a tion functions o f 
recurre nce intervals for stocks dWang et al. , 2006; iRen et al 



2009albl) , but not for the energy futures studied in this work. 

We fit the power-law curves in Fig. [8] using linear least- 
squares regressions between ln,F(.s) and In s. The estimated 
DFA and DMA exponents are presented in Table [3] For each 
futures contract and for each DFA/DMA estimator, no evident 
dependence of the scaling exponent H on the threshold q is ob- 
served. For crude oil and gasoline, we find roughly that 



^CDMA > HuFA > HbdMA ~ #FDMA- 

For heating oil, we have 



DFA 



>H 



BDFA 



He 



DMA 



FDMA- 



For propane, we find 

Hdfa > Hqdfa > Hbdma ~ Hfuma- 



(30) 



(31) 



(32) 



Despite of these discrepancies, there is no doubt that the recur- 
rence intervals of crude oil, gasoline and heating oil futures pos- 
sess long-term correlations. However, the results for propane 
are ambiguous: The DFA approach suggests the presence of 
long-term correlations, while the DMA methods do not. 

We have also performed DFA and DMA on the recurrence 
interval time series of the shuffled volatility series. We find 
that the estimated DFA and DMA exponents are close to 0.5. 
It means that the long-term correlations in the recurrence in- 
tervals originate from the long-term correlation in volatility. 
This finding is the sa me as for stock and index volatilities 
dYamasaki et al.ll2005l) . 



time series for four NYMEX energy futures (crude oil, gaso- 
line, heating oil and propane), attempting to understand the be- 
haviors of large volatilities. Specifically, the distributions and 
memory effects of recurrence intervals are investigated. 

We found that there is no scaling behavior in the distribu- 
tions for different thresholds q after the recurrence intervals are 
scaled with the mean recurrence interval f, which has been ver- 
ified by the two-sample Kolmogorov-Smirnov test. We also 
found that the recurrence intervals are distributed as a stretched 
exponential P C] {t) ~ e (flT) Y and the exponent y decreases with 
increasing q, which are significant under the Kolmogorov- 
Smirnov test and the Cramer-von Mises test. We showed that 
the empirical estimations are in nice agreement with the numer- 
ical integration results for the occurrence probability W q (At\t) 
of a next large volatility above the threshold q within a short 
time interval after an elapsed time t from the last large volatil- 
ity above q. 

We also investigated the memory effects of the recurrence in- 
tervals. We found that the conditional distributions of large and 
small recurrence intervals differ from each other and the con- 
ditional mean of the recurrence intervals scales as a power law 
of the preceding interval t(to)/t ~ {ro/ff with f3 > 0, indicat- 
ing that the recurrence intervals have short-term correlations. 
Detrended fluctuation analysis and detrending moving average 
analysis further uncover that the recurrence intervals possess 
long-term correlations. We confirmed that the "clustering" of 
the volatility recurrence intervals is caused by the long-term 
correlations well known to be present in the volatility. 

Our findings shed new lights on the behavior of large volatil- 
ity and have potential applications in risk estimation of energy 
futures. Theoretically, these findings have important implica- 
tions in the pricing and risk management of futures' derivatives 
such as options. 
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